A METHOD OF DECODING DATA ENCODED IN A MONOCHROME MEDIUM

CROSS REFERENCE TO RELATED APPLICATIONS 

Reference is made to commonly-assigned copending U.S. Patent 
Application Serial No. 10/000,407, filed November 2, 2001, entitled DIGITAL
DATA PRESERVATION SYSTEM, by Wong et al.; and U.S. Patent Application 
Serial No. 10/059,994, filed January 29, 2002, entitled A METHOD OF 
ENCODING DATA IN A MONOCHROME MEDIA, by Abhyankar et al., the 
disclosures of which are incorporated herein. 

FIELD OF THE INVENTION

This invention generally relates to a method for long-term 
preservation of data and more particularly relates to the decoding of data 
preserved with an image on monochrome media. 

BACKGROUND OF THE INVENTION 
In spite of numerous advances in development and use of color
imaging media, there are a number of conditions in which monochrome imaging 
media must be used. For example, archival or long-term preservation of images 
may require that images be stored on a monochrome medium. As another example,
there can be advantages to compact storage of images, where it is desirable to use
a monochrome medium for preserving a color image, with accompanying encoded
information. 

There can be a considerable amount of data associated with an 
image, where the data concerns the image itself. For example, in printing 
applications, information about an image can include color separation data for
corresponding cyan, magenta, yellow, and black (CMYK) inks or other colorants.
Typically, color separations can be stored as separate images on monochrome 
media, so that each color separation is then stored as a separate monochrome 
image. For example, U.S. Patent No. 5,335,082 (Sable) discloses an apparatus 
using a plurality of monochrome images as separations of a composite color
image. Similarly, U.S. Patent No. 5,606,379 (Williams) discloses a method for
storing color images on a monochrome photographic recording medium in which 
separate R, G, and B or lightness and chroma channels are stored as separate 
images. Such methods may be acceptable for some types of storage
environments; however, it can be appreciated that there would be advantages in 
storing fewer images and in providing a more compact arrangement. 

A number of existing methods for encoding data associated with an 
image are directed to the problem of encoding color image information within a
monochrome image. Examples of solutions for this type of image-data encoding 
include the following: 

U.S. Patent No. 5,557,430 (Isemura et al.) discloses a method 
for processing a color image in order to encode color 
recognition data on a resulting monochrome image. The
method described in U.S. Patent No. 5,557,430 provides some 
amount of color information available; however, such a method 
is usable only in limited applications, such as where only a few 
spot colors are used on a document, such as a business 
presentation.

U.S. Patent No. 5,701,401 (Harrington et al.) discloses a 
method for preserving the color intent of an image when the 
image is printed on a monochrome printer. Distinctive patterns 
are applied for each color area. 

U.S. Patent No. 6,179,485 (Harrington) discloses a method for
encoding color information in monochromatic format using 
variously stroked patterns. This method is primarily directed to 
preserving color intent for fonts and vector (line) drawings. 
Similarly, U.S. Patent No. 6,169,607 (also to Harrington) 
discloses methods for encoding color data in monochrome text
using combinations of bold, outline, and fill pattern effects. 
U.S. Patent Nos. 4,688,031 and 4,703,318 (both to Haggerty) 
disclose methods for monochromatic representation of color 
using background and foreground patterns. 

Overall, the methods disclosed in U.S. Patent Nos. 5,557,430;
5,701,401; 6,179,485; 4,688,031; and 4,703,318 may provide some color 
encoding that is useful for documents using a very limited color palette, such as 
business documents and charts. However, these methods would be unworkable
for a full-color image, where the need for a pixel-by-pixel encoding would require 
considerably greater spatial resolution than these methods provide. At best, such 
methods may be able to provide a rudimentary approximation of color using 
relative lightness levels. However, there is no provision in any of the schemes
given in the patents listed above for encoding of additional data related to the 
color image when it is represented in monochrome format. 

Known methods used for encoding data associated with an image 
include that disclosed in U.S. Patent No. 5,818,966 (Prasad et al.), which discloses
encoding color information along a sidebar that prints with a monochrome version
of a document. This solution would have only limited value, such as with charts 
and other business graphics using a palette having a few colors. 

Each of the solutions noted above is directed to encoding data 
about the image itself, such as color data. However, it may be useful to encode
other types of data that, although not directly concerned with image representation
itself, may be associated with an image. For example, an image can have 
associated audio data, animation data, measurement data, text, or other data, 
where it is advantageous to have such data coupled in some manner with the 
image. Use of a sidebar, such as disclosed in U.S. Patent No. 5,818,966, provides
some solution; however, such a solution requires additional media area that may
not be inherently coupled to an image. Because most images are stored in a 
rectangular format, any additional patch of information must be stored above, 
below, or on either side of the image. Accompanying information would take up 
additional space on the media. In addition, any encoded information provided in a
separate area of the storage medium could be intentionally or unintentionally
separated from the image itself. 

Methods for encoding data in visible form on a monochromatic 
medium include the following: 

U.S. Patent No. 5,091,966 (Bloomberg et al.) discloses the use
of monochromatic glyph codes encoded onto a document
image, in visual juxtaposition to the image. Notably, the area
in which the glyph codes are encoded is separate from the
document image itself with this solution. 
U.S. Patent Nos. 6,098,882 (Antognini et al.) and 4,939,354 
(Priddy et al.) disclose methods for encoding digital data onto 
paper in compact form using bi-tonal markings grouped in a
spatial array of cells. The ability to provide increasingly more 
compact data storage on monochrome media, using methods 
such as those disclosed in U.S. Patent Nos. 6,098,882 and
4,939,354, can be attributed, in large part, to continuing 
improvement in the spatial resolution of desktop scanners.

U.S. Patent No. 5,278,400 (Appel) discloses a method for 
encoding data in a cell comprising multiple pixels, where the 
halftone gray level of each individual pixel, in combination 
with other pixels within the cell, encodes a data value for the 
cell. The method disclosed in U.S. Patent No. 5,278,400 also
takes advantage of increased spatial resolution of scanners, 
supplemented by the capability of a scanner to sense gray level 
at an individual pixel within a cell. 
The methods disclosed in U.S. Patent Nos. 5,278,400, 6,098,882, 
and 4,939,354 provide data encoding for compact data storage on a monochrome
medium. However, neither these methods nor the methods disclosed in the 
patents cited above provide a mechanism for integrally coupling data to an 
associated image. These methods also require space on the monochrome medium, 
in addition to that required for the image itself. 

Some types of monochrome media, such as paper, for example,
allow reproduction of only a limited range of perceptible densities. That is, only a 
few different density levels can be reliably printed or scanned from such types of 
media. However, there are other types of monochrome media that have 
pronouncedly greater sensitivity. Conventional black and white photography film, 
for example, is able to faithfully and controllably reproduce hundreds of different
gray levels, each measurably distinct. Other specialized films and photosensitive 
media have been developed that exhibit wider overall dynamic range and higher 
degrees of resolvable density, able to produce a higher number of distinct
grayscale values. 

It is instructive to observe that the term "grayscale" is 
conventionally associated with a range of densities where the monochromatic 
color hue is black. However, for the purposes of this application, the
monochromatic color hue, or color base, for a grayscale image need not be black, 
but could be some other color. For example, some types of monochrome film 
have a very dark blue color hue that could be used as the color base for grayscale 
imaging. Regardless of the precise color hue, the term "grayscale" as used herein
relates to a range of measurable density values of a single base color, formed at
individual pixel locations on a digital preservation medium. 

It is instructive to note that the human viewer perceives only a 
limited number of grayscale gradation values, centered on a range that is well 
within the overall dynamic range of most types of photosensitive media.

Generally, a bit depth of 8 bits is sufficient for storing the grayscale values
perceptible in monochrome images. While, for human perception, there may be
no need for visible representation exceeding a bit depth of 8 bits, it could be
possible to reproduce an image having a larger bit depth, with 10, 12, or more
bits of resolution, for example, using photosensitive media described above. In
fact, many conventional scanners have additional sensitivity for grayscale
resolution. The four-color printing industry, for example, uses high-resolution 
color scanners that are able to provide very high spatial resolution and very 
sensitive color resolution. As just one example, the SG-8060P MarkII High-end
Input Scanner from Dainippon Screen is claimed to be capable of scanning at 12,000
dpi and providing 48-bit RGB resolution. Anticipated improvements in scanning
technology are expected to make the capability for such high resolution and high 
density sensitivity more readily accessible and more affordable. This would 
mean, for example, that a scanner could have sufficient sensitivity to provide data 
with a bit depth exceeding 8 bits when scanning a highly sensitive medium, even
though 8-bit grayscale representation is sufficient for storing an image in
human-readable form.



Conventionally, in converting a full-color image to a monochrome
format, only the relative lightness or darkness value of a color is used to determine
a corresponding grayscale representation. Chroma information, which indicates 
color hue content, is largely ignored. For this reason, restoration of original color 
information to an image, once converted to monochrome format, is not easily 
feasible. It can be appreciated that image storage solutions that preserved some 
color information, even if approximate, could be advantageous. 

Thus it can be seen that conventional document storage and 
preservation solutions fall far short of meeting the need to integrally couple data 
related to an image to the image itself. Even though the capability exists for 
reproducing and measuring image density sensitivity well in excess of the human- 
perceptible range, no use has been made of this excess capability for its data 
storage potential. 

SUMMARY OF THE INVENTION 

It is an object of the present invention to provide a method of 
decoding, from a rasterized image formed on a monochrome medium, a plurality 
of data values encoded within each pixel, the method comprising, for said each 
pixel, the steps of: 

(a) scanning said each pixel to obtain a monochrome density 
value having a predetermined bit depth; 

(b) decomposing said monochrome density value to obtain a 
first data field and a second data field; 

(c) decoding said first data field in order to obtain a first data 
value; and 

(d) decoding said second data field in order to obtain a second 
data value. 
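
As a rough illustration of steps (b) through (d), the following Python sketch splits a scanned density value into bit fields and then looks up a decoded value for each field. The function names, the field widths passed in, and the use of lookup tables for the decoding step are assumptions made for illustration only; the specification does not prescribe them.

```python
from typing import List, Sequence


def decompose(density_value: int, field_widths: Sequence[int]) -> List[int]:
    """Split a density value into data fields, most significant field first."""
    total_bits = sum(field_widths)
    if not 0 <= density_value < (1 << total_bits):
        raise ValueError("density value does not fit the predetermined bit depth")
    fields = []
    shift = total_bits
    for width in field_widths:
        shift -= width
        # Isolate `width` bits starting at bit position `shift`.
        fields.append((density_value >> shift) & ((1 << width) - 1))
    return fields


def decode_pixel(density_value, field_widths, lookup_tables):
    """Decode each field through its own lookup table (an assumed mechanism)."""
    fields = decompose(density_value, field_widths)
    return [table[field] for field, table in zip(fields, lookup_tables)]
```

With field widths of (4, 2, 2), for instance, `decompose(0b10110110, (4, 2, 2))` yields the three field values 11, 1, and 2.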

It is a feature of the present invention that it allows a coupling of 
data associated with a document to the document itself, in such a way that the 
coupled, encoded data is not easily separable from the image of the document, but 
does not obscure the image. At the same time, the coupled data can be encoded in 
a manner that is imperceptible, while the document itself is visible. The method 
of the present invention allows a document and its associated encoded data to be 
preserved on a monochrome preservation medium, available for future access and
decoding. 

The present invention takes advantage of the high levels of 
resolvability available with some types of monochromatic media. High
resolvability allows encoding of data in gray levels, where the number of gray
levels that can be reproduced and detected exceeds the number of distinct gray 
levels that can be distinguished by the human eye. 

It is an advantage of the present invention that it provides a method 
for long-term preservation of a document and its associated data as a single unit.

It is yet a further advantage of the present invention that it provides
a method for preserving, onto a monochrome medium, data about a full-color 
image. This includes, for example, tristimulus color data based on standard 
L*a*b* color space, on hue-saturation-brightness color space, on RGB color 
space, or on CMY color space.

It is yet a further advantage of the present invention that it provides
a method for storing metadata associated with a document or with document 
image processing in a manner such that the metadata is closely coupled or, in 
some embodiments, integrally coupled to the document. 

It is yet a further advantage of the present invention that it provides
a method for storage of data having considerable density, yet without making
existing equipment obsolete. That is, existing image sensing apparatus may not be 
able to take advantage of denser data encoding capabilities offered by the present 
invention, but can still be used for scanning an image preserved using these 
techniques, for example. For images, higher-order density values typically store
the lightness channel information, so that, as a baseline, an image remains
human-readable even if it contains considerable additional data content.

It is yet a further advantage of the present invention that it allows a 
provider of document preservation services to offer its customers variable levels 
of information decoding, so that only needed portions of encoded data are restored
in response to a customer request.

These and other objects, features, and advantages of the present 
invention will become apparent to those skilled in the art upon a reading of the
following detailed description when taken in conjunction with the drawings
wherein there is shown and described an illustrative embodiment of the invention.

BRIEF DESCRIPTION OF THE DRAWINGS
While the specification concludes with claims particularly pointing 
out and distinctly claiming the subject matter of the present invention, it is
believed that the invention will be better understood from the following 
description when taken in conjunction with the accompanying drawings, wherein: 

Figure 1 is a block diagram showing an arrangement of 
components used to preserve a document along with its associated encoded data;

Figure 2 is a flow chart illustrating key steps in processing for
document preservation with associated encoded data; 

Figure 3 is a visual representation of a data word having multiple 
data fields, each data field having a predetermined bit depth; 

Figure 4 is a graph showing a typical relationship of density to the 
logarithm of exposure energy for a typical photosensitive medium, indicating
separate density ranges of interest; 

Figure 5 is a visual representation of a mapping operation for 
correlating data fields within a data word of a larger bit depth to data fields within 
an 8-bit byte;

Figure 6 is a visual representation of an 8-bit byte used in a
mapping operation such as that illustrated in Figure 5; 

Figure 7 is a plan view showing one possible layout arrangement
for a preserved document record; 

Figure 8 is a plan view showing a metadata record and calibration
strip on a media roll;

Figure 9 is an example data listing for metadata information 
applicable to a media roll, cassette, or other unit; 

Figures 10a through 10d show an example structure and data fields
for metadata information applicable to a preserved document record;

Figure 11 is a flow chart illustrating key steps for decoding
encoded data according to the present invention; 



Figure 12 is a block diagram showing an arrangement of 
components used to decode preserved document data that has been encoded in an 
image; and 

Figure 13 is a visual representation of a monochrome data word 
having an additional data field in an alternate embodiment. 

DETAILED DESCRIPTION OF THE INVENTION 

The present description is directed in particular to elements 
forming part of, or cooperating more directly with, apparatus in accordance with 
the invention. It is to be understood that elements not specifically shown or 
described may take various forms well known to those skilled in the art. 

Referring to Figure 1, there is shown a preservation system 80 for 
accepting an input document and its associated data, encoding the data, and 
writing the rasterized image and data encoding onto a monochrome preservation 
medium to generate a preserved document record 90. A control processing unit 
88, typically a computer workstation, accepts an input document in electronic 
form from any of a number of possible sources. One input source could be a 
networked graphics workstation 82. Alternately, an input document could be 
from a printed page 84, photograph, or other printed image that can be converted 
to electronic form by a scanner 86. Other possible document sources could 
include, but are not limited to, digital camera images, Photo CD images, on-line 
image archives, computer-generated images such as from CAD and graphics 
design software packages and multimedia software packages, document 
processing systems, and imaging instruments, for example. Documents could 
include data files of many types, including web pages, spreadsheets, email, 
electronic files from programs such as Microsoft Word, PowerPoint, Excel, and 
the like. 

Control processing unit 88 accepts the document data from any 
suitable source and formats image data into a rasterized form suitable for a printer 
92. In rasterized form, the document is converted into one or more images. Each 
rasterized image comprises a two-dimensional array of pixels, with each pixel 
having an assigned value, such as a tristimulus color value, for example. In 
addition, control processing unit 88 may also format, encode, and rasterize 
additional data or metadata to be associated with the document and to be imaged
along with the document onto preserved document record 90. This additional data or
metadata may be provided by software that executes on control processing unit 88 
itself or may be provided from graphics workstation 82 or from some other data 
source. This data or metadata could include information entered by a user or
customer of preservation system 80. 

Monochrome Preservation Media for Images and Encoded Data 

Examples of suitable human-readable preservation media for 
imaging by preservation system 80 include microfilm and related film products 
and other types of media having similar long-life expectancy and excellent image
stability. In addition to film-based media, some other media types that may be 
acceptable, in some form, for use as human-readable preservation media include 
the following: 

(a) electrophotographic media, when properly treated and 
finished;

(b) thermal media, such as thermal dye sublimation media; 

(c) inkjet media, particularly using plastic film or reflective 
materials; and 

(d) metal plate materials, written using methods such as etching 
and laser ablation.

The materials that are used for human-readable preservation media 
are characterized by exceptionally long useful life. This is in contrast to 
conventional binary storage media, such as magnetic tapes or disks or optical 
storage media. These conventional media types are not readable to the human
eye, whether aided by magnification or unaided, and are not suitable for reliable
long-term data storage due to their relatively short lifespan and due to hardware 
and software dependencies for data access from these media. For example, 
changes to operating system, CPU, or application software can render data that
has been recorded on binary storage media unusable. By contrast, data
recorded in human-readable form on preservation media can still be interpreted,
regardless of changes to CPU, operating system, or application software. 

Preservation media are typically packaged and provided in some
form capable of holding multiple records or frames. Typical formats include roll, 
cassette, or cartridge format. Preferably, the preservation medium exhibits a 
sufficient, controlled dynamic range that allows representation of many more 
individual grayscale levels than are distinguishable to the human eye. The 
potential excess capability of high-quality monochrome media, such as, for 
example, KODAK Film SO-240 produced by Eastman Kodak Company, 
Rochester, New York, makes it possible to utilize media of this type for encoding, 
into image pixels, related data that is associated with that image. 

Stages in Document Processing

As the above description suggests, any of a number of types of 
data, including metadata, can be encoded for preservation on a monochrome 
medium along with the rasterized image of a document. A few of the numerous 
types of data that might commonly be preserved with an image include color data, 
audio, measurement, and animation data, for example. For the purpose of initial 
description, the processing sequence for preservation of document data that is 
described with reference to Figures 2 through 11 below uses, as an illustrative
example, the encoding and preservation of tristimulus color data associated with 
an image document. Following the description for this type of encoding, the 
discussion of this specification then broadens its scope to encompass more general 
cases of encoding and decoding associated data. 

Referring then to the flow chart of Figure 2, there is shown a 
processing sequence for encoding document data to a monochrome medium. As 
was described above, an input file in electronic form is provided to this process; in 
the preferred embodiment, the input file includes a color image. A rasterization 
step 200 formats the input file to a rasterized, pixel format, where each pixel has 
an associated raster value. In the preferred embodiment, this raster value is a 
tristimulus color image value using CIELAB color space, with component values 
of lightness (L*), a-chroma (a*), and b-chroma (b*). A counter initialization step
202 and a counter increment step 204 are provided to illustrate the mechanics of 
looping operation for processing each image pixel. For each pixel, a monochrome 
word assignment step 206 assigns a word for storing encoded values for grayscale 
representation. The assigned monochrome data word has a predetermined bit
depth that is a function of the density resolution of the preservation medium, the
density-marking characteristics of printer 92, and the performance characteristics 
of an intended scanning device for scanning and extracting encoded data at some 
future time. The data word is itself partitioned into a first data field, a second data 
field, and possible third and subsequent data fields. In the preferred embodiment, 
the monochrome data word has first, second, and third data fields for encoding 
lightness, a-chroma, and b-chroma values respectively. For each pixel an 
encoding step 208 is then executed. In encoding step 208, the first component 
value, lightness L* in the preferred embodiment, is encoded in a first data field of 
the monochrome data word. A second value is then encoded in a second data field 
of the monochrome data word. This is the a-chroma value a* in the preferred 
embodiment. The b-chroma value b* is then encoded in a third data field of the 
monochrome data word in the preferred embodiment. However, other types of 
data could alternately be encoded into the second, third, and subsequent data fields 
as part of encoding step 208. A number of data representation schemes can be 
employed for encoding additional values to additional data fields of the 
monochrome data word. At the conclusion of encoding step 208, a grayscale 
forming step 210 is then executed. In grayscale forming step 210, the various data 
fields in the monochrome data word are used to generate a grayscale value for 
imaging the pixel. The monochrome data word can be used without any 
modification; alternately, its fields can be concatenated or otherwise combined in 
some other order. In an imaging step 212, then, the pixel can be formed by printer
92 with the intended grayscale value generated in grayscale forming step 210.
Finally, a looping decision step 214 determines whether or not each pixel has been 
assigned its grayscale value. 

Those skilled in the computing arts can readily recognize that the 
flow chart of Figure 2 illustrates only one possible implementation of image 
encoding and printing using a loop, using the mechanics of steps 202, 204, and 
214. Alternate logic flow sequences could be used. In practice, imaging step 212 
would most likely write the data for pixel n into an intermediate memory buffer or
similar structure, so that a complete image could be sent to printer 92 at one time.
Regardless of the exact processing mechanics, however, the basic assignment and
value mapping scheme outlined in steps 200, 206, 208, and 210 of Figure 2 would 
be carried out in some fashion in order to implement the method of the present 
invention. 
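
The per-pixel loop and intermediate buffer described above might be sketched in Python as follows. This is only one possible arrangement: the function names are hypothetical, the raster is assumed to be a list of rows of (L*, a*, b*) tuples with 8-bit components, and `lightness_only` is a deliberately simple stand-in for encoding step 208 rather than the encoding of the preferred embodiment.

```python
def encode_image(raster, encode_pixel):
    """Apply an encoder to every pixel and collect the grayscale values.

    The results go into an in-memory buffer (a list of rows) so that a
    complete image could be handed to a printer at one time.
    """
    buffer = []
    for row in raster:
        buffer.append([encode_pixel(L, a, b) for (L, a, b) in row])
    return buffer


def lightness_only(L, a, b):
    """Hypothetical stand-in encoder: keep 4 bits of lightness, no chroma."""
    return (L >> 4) << 4
```

Passing the encoder in as a parameter keeps the looping mechanics separate from the value mapping, which mirrors the distinction drawn above between steps 202, 204, and 214 and steps 200, 206, 208, and 210.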

As shown in Figure 3, for most standard tristimulus color imaging, 
the input file is encoded in a 24-bit raster value 100. A preprocessing step may be 
needed to convert color image data into a suitable format such as that represented 
in Figure 3. One common color image format uses the familiar CIE 1976 L*a*b* 
or CIELAB color space of the CIE, Commission Internationale de l'Eclairage 
(International Commission on Illumination), well known to those skilled in the 
color imaging arts. For the CIELAB format, there are three channels of 
information: lightness (L*), a-chroma (a*) and b-chroma (b*). Each channel of
information uses 8 bits, so that a complete 24-bit word is needed to express the
CIELAB L*a*b* color space value of each image pixel, as was shown in Figure 3. 
Raster value 100 as shown in Figure 3 has a bit depth of 24 bits with 3 data 
components. A first data component 104 contains the L* channel value. A 
second data component 106 contains the a* channel value. A third data 
component 108 contains the b* channel value. 
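
A 24-bit raster value of this kind could be packed and unpacked as in the Python sketch below. The placement of the L* component in the most significant byte is an assumption made for illustration, as are the function names; the text above states only that the word holds three 8-bit components.

```python
def pack_raster_value(L, a, b):
    """Combine three 8-bit components into one 24-bit raster value."""
    for component in (L, a, b):
        if not 0 <= component <= 0xFF:
            raise ValueError("each component must fit in 8 bits")
    # Assumed layout: L* in bits 23-16, a* in bits 15-8, b* in bits 7-0.
    return (L << 16) | (a << 8) | b


def unpack_raster_value(value):
    """Recover the (L, a, b) components from a 24-bit raster value."""
    return (value >> 16) & 0xFF, (value >> 8) & 0xFF, value & 0xFF
```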

Ideally, it would be advantageous to be able to store each 24-bit
CIELAB L*a*b* value for each pixel. However, there are two practical 
considerations that underlie the implementation of the encoding scheme that 
follows: 

(1) limitations of the monochrome media. While it may be 
theoretically possible to accurately reproduce 10-, 12-, or 14-bit
or greater resolution on a monochrome medium, existing media 
and imaging techniques would make it very difficult to 
approach the 24-bit resolution that would be needed for full, 
lossless encoding. 

(2) limitations of human perception. With respect to 
monochrome imaging, the human eye is sensitive to a limited 
number of grayscale monochrome gradations. In practice, as 
few as 16 different grayscale levels provide monochrome 
representations of color images that are considered visually
accurate and pleasing. 

As the graph of Figure 4 shows for a typical photosensitive 
medium, the density response can be segmented into three overall regions. The 
human eye is most sensitive over a high contrast region 124. The photosensitive 
medium also exhibits density response over a shoulder region 122 and a toe region 
126; however, human perception is not highly sensitive within these high and low
extremes. In conventional tristimulus color-to-monochrome mapping schemes, 
only high-contrast region 124 is used, and typically only for mapping to a 
corresponding lightness channel value. 

In light of these considerations, then, encoding step 208 of the 
present invention (Figure 2) performs a mapping from the 24-bit L*a*b* color
space representation of raster value 100 to an 8-bit byte that serves as a 
monochrome data word 120. Referring to Figure 5, there is shown the mapping 
scheme from raster value 100 to monochrome data word 120 as used in a 
preferred embodiment. The 8-bit value in first data component 104, containing 
the L* value, is mapped to a first data field 114, which contains 4 bits. This
mapping enables as many as 16 discrete grayscale levels to be represented for the
lightness values of pixels in the original color image. The 8-bit value in second
data component 106, containing the a* value, is mapped to a second data field 
116, which contains 2 bits. Similarly, the 8-bit value in third data component 108, 
containing the b* value, is mapped to a third data field 118, which also contains 2 
bits. 
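
The 4-2-2 partition just described can be illustrated with the Python sketch below. It assumes simple uniform quantization (keeping the top bits of each 8-bit component), which is a substitute chosen for brevity and is not the mapping method of the preferred embodiment. The field order, with lightness in the upper 4 bits followed by a* and then b*, and the function names are likewise assumptions.

```python
def quantize(value_8bit, bits):
    """Reduce an 8-bit value to `bits` bits by keeping its top bits.

    Uniform quantization is an assumption made for this sketch only.
    """
    return value_8bit >> (8 - bits)


def to_monochrome_word(L, a, b):
    """Form an 8-bit monochrome data word from 8-bit L*, a*, b* components."""
    field_1 = quantize(L, 4)  # lightness: 16 levels
    field_2 = quantize(a, 2)  # a-chroma: 4 levels
    field_3 = quantize(b, 2)  # b-chroma: 4 levels
    return (field_1 << 4) | (field_2 << 2) | field_3
```

Because the lightness field occupies the most significant bits, two pixels with the same lightness field differ by at most 15 out of 255 density steps regardless of their chroma fields.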

For mapping of components 104, 106, and 108 to data fields 114,
116, and 118 respectively, a number of methods can be used. In the preferred 
embodiment, mapping is performed using a straightforward histogram and 
statistical techniques for mapping a large set of multiple values to a smaller set of 
representative key values, where each key value allows a reasonable 
approximation of a set of nearby larger values. For example, for actual image data 
values ranging from 18 to 23, a representative key value 20 may be chosen. 
Further encoding processes may then map key value 20 to an integer value that 
can be represented using 2 or 4 bits. Such statistical and mapping techniques, 
familiar in the data processing arts, enable effective "compression" of image data
so that some amount of color data that may have been originally obtained at 8-bit 
resolution can be preserved in a 2-bit or 4-bit data field of monochrome data word 
120. 
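
One way such a key-value mapping might be realized is sketched below in Python. The specification does not detail the histogram and statistical techniques used, so the approach here (equal-population bins over the sorted sample values, with the lower median of each bin as its key value) is an assumption, as are the function names.

```python
def choose_key_values(values, bits):
    """Pick 2**bits representative key values from sample data.

    Assumed technique: split the sorted samples into equal-population
    bins and take the lower median of each bin as its key value.
    """
    levels = 1 << bits
    ordered = sorted(values)
    n = len(ordered)
    if n < levels:
        raise ValueError("need at least as many samples as key values")
    keys = []
    for i in range(levels):
        bin_values = ordered[i * n // levels:(i + 1) * n // levels]
        keys.append(bin_values[(len(bin_values) - 1) // 2])
    return keys


def to_index(value, keys):
    """Map a value to the index of its nearest key value."""
    return min(range(len(keys)), key=lambda i: abs(keys[i] - value))
```

For sample data clustered at 18 to 23, 60 to 65, 120 to 125, and 200 to 205, a 2-bit field under this sketch gets the key values 20, 62, 122, and 202, and every value from 18 to 23 maps to index 0. A decoder would need the same table of key values to turn an index back into an approximate component value.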

In the preferred embodiment, as is shown in Figure 5, the 2 bits for 
each a* value, and the 2 bits for each b* value in monochrome data word 120 
allow the mapping of corresponding 8-bit chroma values to the appropriate one of 
the indexed a* and b* chroma values. Similarly, using 4 bits for the L* value
allows mapping of an 8-bit lightness value to an appropriate indexed value with 
higher resolution. 

Returning to Figure 2, grayscale forming step 210 may be no 
more complicated than simply using, as the grayscale value, all data fields 114, 
116, and 118 in monochrome data word 120, plus any additional data fields into 
which monochrome data word 120 is partitioned. Optionally, depending on the 
available monochrome density resolution, customer requirements, or other factors, 
only individual data fields 114, 116, 118 may be used or fields 114, 116, and 118 
may be concatenated in any suitable combination. 
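
A minimal sketch of one such concatenation is given below, assuming that the 4-bit L* index occupies the upper 4 bits of an 8-bit word and that the 2-bit a* and b* indices occupy the lower 4 bits. The ordering of a* above b* and the function name `form_grayscale` are assumptions made for illustration.

```python
def form_grayscale(l_index, a_index, b_index):
    """Concatenate three field indices into one 8-bit grayscale value.

    Assumed layout, most significant bit first: LLLL AA BB.
    """
    if not (0 <= l_index < 16 and 0 <= a_index < 4 and 0 <= b_index < 4):
        raise ValueError("field index out of range")
    return (l_index << 4) | (a_index << 2) | b_index
```

Under this layout, two pixels with the same L* index differ by at most 15 grayscale steps out of 255, whatever their chroma indices.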

The procedure of Figure 2 is executed for all pixels in the 
rasterized document. Note that the monochrome image that prints as a result of 
the process described above with reference to Figure 2 may have the same overall 
appearance as a monochrome image produced from a color image by using only 
the lightness L* channel information. However, unlike conventional methods that 
use a relative lightness value and preserve no chroma information, the method of 
the present invention allows an indexed lightness value to be represented and 
preserves chroma information in the lower 4 bits of the 8-bit grayscale value. 
Since the lower 4 bits are not readily perceptible to the human observer, the 
information stored in these bits does not interfere with the overall appearance of 
the preserved image; however, scanning the preserved image with a high- 
resolution scanning device will allow the encoding of the lower 4 bits to be 
retrieved. 

Metadata about the Document 

In addition to pixel grayscale values, there may be more 
information needed for re-creation of the original full-color image or needed for 
accompanying the image itself. Referring to Figure 7, there is shown an encoded 
image 96 on preserved document record 90. Below image 96 is a document 
metadata section 94. Document metadata section 94 provides, in human-readable 
form, necessary information for interpreting the document data in encoded image 
96. Information in data fields of document metadata section 94 could include any 
of the following, for example: 

(a) key values or values that occur most frequently; 

(b) color space parameters or pointers to a color palette; 

(c) metadata on bit and data field assignment for grayscale 
values; 

(d) data field concatenation scheme used; 

(e) data field mapping scheme used; 

(f) image spatial resolution, typically in dots-per-inch (dpi), and 
length and number of scan lines; and 

(g) bit depth of grayscale pixels. 

Registration marks 98 are provided as reference targets for use of 
scanner 86 in precisely locating document metadata section 94 and encoded image 
96 on preserved document record 90 during decoding, as described subsequently. 

Referring to Figures 10a through 10d, there is shown an example of 
the human-readable data provided in document metadata section 94. It must be 
emphasized that Figures 10a through 10d are provided for example only, to 
illustrate some of the types of information that can be provided and one possible 
encoding scheme. In actual practice, the metadata that is provided in human- 
readable form is designed to best suit individual requirements of a data 
preservation application. 

In general, the metadata fields must be written in human-readable 
format. Text characters are typically used for encoding in a data format that is 
open, extensible, and self-defining, such as XML (Extensible Markup Language), 
for example. This human-readability allows portions of the document to be 
scanned and automatically interpreted, for example, using scanner 86 with tools 
such as Optical Character Recognition (OCR). 

Figure 10a shows the overall structure of document metadata 
section 94 in a preferred embodiment. Encoded using XML, document metadata 
section 94 includes a header section 94h, followed by color channel sections 94c1, 
94c2, and 94c3, one for each L*a*b* color channel. A terminating trailer section 
94t denotes the end of the file for metadata section 94. Figures 10b, 10c and 10d 
then show metadata fields for color channel sections 94c1, 94c2, and 94c3 
respectively. Each color channel section 94c1, 94c2 and 94c3 gives information 
on bit positions used for encoding color channel data, on value ranges, and on 
mapping definitions for encoding and decoding values. Ellipses (. . .) indicate 
where lines have been removed for simplifying and abbreviating Figures 10a and 
10b. 

By way of illustration, Figure 10b shows how lightness L* values 
from 0 to 100 can be mapped to integers from 0 to 15, allowing the L* data to be 
encoded in a 4-bit data field 114. In the third mapping definition given, for 
example, minimum and maximum boundary values are listed as follows: 

<Channel_Value min="12" max="17"> 

Following this boundary value listing, an encoded value from 0-15 is defined for 
the range, as follows: 

<Encoded_Value>2</Encoded_Value> 

Then, a value for decoding is provided, showing the value that will be assigned, 
from the original range of 0 to 100, upon decoding of the encoded value, using the 
processing sequence described subsequently: 

<Decode_Value>12</Decode_Value> 

From this simple, partial illustration, it can be seen that, for an 
image encoded using this mapping method, values originally in the range 12-17 
will be represented as value 12 when the document image is decoded and restored. 
There will be some loss of image quality; however, by selecting the mapping 
ranges carefully, a reasonably close approximation of the original document 
image can be preserved. 
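
The following Python sketch suggests how mapping definitions of this kind might be read into encode and decode tables once the metadata text has been recovered. Only the element names `Channel_Value`, `Encoded_Value`, and `Decode_Value` and the 12-17 entry come from the example above. The enclosing `Channel` element, the nesting of the encoded and decode values inside `Channel_Value`, the first two entries, and the function names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; only the third entry is taken from the text.
SAMPLE_XML = """
<Channel>
  <Channel_Value min="0" max="5">
    <Encoded_Value>0</Encoded_Value>
    <Decode_Value>0</Decode_Value>
  </Channel_Value>
  <Channel_Value min="6" max="11">
    <Encoded_Value>1</Encoded_Value>
    <Decode_Value>6</Decode_Value>
  </Channel_Value>
  <Channel_Value min="12" max="17">
    <Encoded_Value>2</Encoded_Value>
    <Decode_Value>12</Decode_Value>
  </Channel_Value>
</Channel>
"""

def load_mapping(xml_text):
    """Return (ranges, decode) built from the mapping definitions."""
    root = ET.fromstring(xml_text.strip())
    ranges = []   # (min, max, encoded value), used when encoding
    decode = {}   # encoded value -> restored value, used when decoding
    for entry in root.iter("Channel_Value"):
        lo, hi = int(entry.get("min")), int(entry.get("max"))
        code = int(entry.findtext("Encoded_Value"))
        decode[code] = int(entry.findtext("Decode_Value"))
        ranges.append((lo, hi, code))
    return ranges, decode

def encode(value, ranges):
    """Map a channel value to its encoded value."""
    for lo, hi, code in ranges:
        if lo <= value <= hi:
            return code
    raise ValueError("value outside all mapping ranges")
```

For example, a channel value of 15 encodes to 2 and is restored as 12, which is the loss described above.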

Metadata about the Media 

Referring to Figure 8, media imaging characteristics must also be 
provided in order to decode encoded information from any image 96 on the media 
roll 190. In a preferred embodiment, the function of preserving media imaging 
characteristics is performed by assigning one or more separate media metadata 
documents 194 to document positions on media roll 190. Note that media roll 190 
could be a roll of media or could be a cartridge, cassette, or other packaging unit. 
Information in media metadata document 194 that supports decoding operation 
could include any of the following, for example: 

(a) media calibration data or look-up tables; and 

(b) error-correction encoding information. 

Registration marks 98 are provided as reference targets for use of scanner 86 in 
precisely locating media metadata document 194 and a calibration patch 196 on 
media roll 190 for decoding, as described subsequently. 

In order for media metadata document 194 to be useful on any 
future hardware platform, the encoded data in media metadata document 194 must 
be in human-readable form. Referring to Figure 9, there is shown an example of a 
portion of the encoding of media metadata document 194 in the preferred 
embodiment. As shown in Figure 9, media metadata document 194 may include 
write and read calibration data for the preservation medium and characteristics for 
printer 92. 

In addition to the media metadata and image metadata components 
listed above, there can be additional metadata that is associated with the roll, 
cartridge, cassette, or other unit in which the preservation medium is packaged. 
This metadata can be provided within media metadata document 194 and may 
include information on media type, aging characteristics, directory or document 
tracking data, and other information, for example. 

Referring again to Figure 8, calibration patch 196 is also provided 
as part of the media metadata to allow calibration of a scanner for reading 
individual pixels of each image 96. In a preferred embodiment, calibration patch 
196 is provided along with media metadata document 194. A number of alternatives are 
possible, including having calibration patch 196 associated with the individual 
image 96 or with a group of images 96. Calibration patch 196 could be arranged 
in a simple format, establishing sample or index points along a curve of non-linear 
density vs. code value or, where density is linear with respect to a range of code 
values, establishing end-points of a line or line segment. Calibration patch 196 
could alternately include numeric annotation to label the intended data values 
corresponding to one or more densities reproduced in the set of samples. 
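
As a sketch of how such sample points might be used on reading, the following Python fragment converts a measured density to a code value by piecewise-linear interpolation between calibration samples. The sample readings in `PATCH`, the assumption that density increases with code value, and the name `density_to_code` are all hypothetical; actual values would be obtained from calibration patch 196.

```python
from bisect import bisect_right

# Hypothetical readings: (measured density, intended code value),
# sorted by increasing density.
PATCH = [(0.10, 0), (0.45, 64), (0.95, 128), (1.60, 192), (2.40, 255)]

def density_to_code(density, patch=PATCH):
    """Interpolate linearly between the two nearest calibration samples."""
    densities = [d for d, _ in patch]
    if density <= densities[0]:
        return patch[0][1]
    if density >= densities[-1]:
        return patch[-1][1]
    i = bisect_right(densities, density)
    (d0, c0), (d1, c1) = patch[i - 1], patch[i]
    return round(c0 + (c1 - c0) * (density - d0) / (d1 - d0))
```

Where density is linear over a range of code values, two end-points suffice and the same function applies with a two-entry table.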

The contone image mapping method described above is somewhat 
lossy. That is, due to the approximation provided using histograms and statistical 
techniques, a color image that has been decoded and restored from its preserved 
document record 90 would not reproduce its original colors with precision in all 
cases. However, extensions of the embodiment described above could be used to 
improve storage for chroma as well as for lightness channels. For example, with 
12-bit resolution, data fields 114, 116, and 118 could be scaled to 3 or 4 bits, 
allowing additional gradation in chroma data as stored. With higher resolution, 
which means a larger bit depth, additional data could be encoded. The method of 
the present invention can be practiced given any reasonably high resolution, with 
data fields assigned and organized accordingly. As a general principle, 
increasingly more robust arrangements are possible when larger bit depths become 
available. 

Generalized Data Coupling to Document Image 

The example outlined above with reference to Figures 2 through 7 
was directed to the encoding of L*a*b* values in monochrome pixels. The same 
method could alternately be adapted for storing other types of information within 
grayscale levels, with selected data fields in any of a number of arrangements. 
With reference to Figure 6, for example, the visual appearance of an image could 
be preserved using first data field 114 for grayscale representation, while using 
second and third data fields 116 and 118, whether separately or combined, for 
storage of alternate information. For example, by combining second and third 
data fields 116 and 118, monochrome data words 120 for successive pixels could 
be used to store a sequence of audio bytes, with each monochrome data word 120, 
that is, each pixel, storing one half byte. 
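
The half-byte arrangement can be sketched as follows, assuming 8-bit pixels whose upper 4 bits carry the grayscale representation and whose lower 4 bits are the combined second and third data fields. The high-half-first ordering and the function names are assumptions made for this example.

```python
def embed_audio(pixels, audio):
    """Write successive audio half bytes into the low 4 bits of each pixel."""
    nibbles = []
    for byte in audio:
        nibbles.extend((byte >> 4, byte & 0x0F))  # high half first (assumed)
    if len(nibbles) > len(pixels):
        raise ValueError("not enough pixels for the audio data")
    out = list(pixels)
    for i, nibble in enumerate(nibbles):
        out[i] = (out[i] & 0xF0) | nibble  # upper 4 bits are left unchanged
    return out

def extract_audio(pixels, n_bytes):
    """Recover n_bytes audio bytes from the low 4 bits of 2 * n_bytes pixels."""
    nibbles = [p & 0x0F for p in pixels[:2 * n_bytes]]
    return bytes((hi << 4) | lo for hi, lo in zip(nibbles[0::2], nibbles[1::2]))
```

Two pixels are consumed per audio byte, so an image of m pixels can carry at most m/2 bytes in this arrangement.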

The mapping method of the preferred embodiment could be altered 
in a number of different ways within the scope of the present invention. For 
example, it might be desired to arrange fields differently for mapping L*a*b* 
values. In a particular application, there may be no advantage in printing an 
image with accurate monochrome representation; in such a case, L* values might 
be mapped to alternate fields within monochrome data word 120. Any 
arrangement of data fields could be used as an alternative to the structure shown in 
Figure 6. For example, third data field 118 or some additional data field could be 
assigned for image metadata, security information, authentication information 
such as a digital signature, error correction data, information about the overall 
document, or a reference to such information. The data stored in a data field 
could be encoded data or could be one part of a byte, word, or other data unit, 
where the individual parts of the data unit span multiple pixels. A data field could 
store data directly, or store a reference or pointer to data, such as a pointer to a 
color palette, for example. Fields in addition to data fields 114, 116, and 118 
could be assigned, for encoding additional data to be preserved in preserved 
document record 90. 

Encoding Data Using Shadows/Highlights Regions 

Referring back to the density curve of Figure 4, it is instructive to 
observe that images are primarily represented using densities within high contrast 
region 124. In general, toe region 126, representing very low densities, and 
shoulder region 122, representing very high densities, may be usable for data 
storage. This may mean using very dark or very light pixels within image 96 for 
storing encoded data, for example, where pixels above or below specific threshold 
densities are used primarily for data encoding. 

Alternate Mapping Schemes 

For preservation of color information, use of the CIELAB L*a*b* 
format is most favorable, since a lightness channel L* value easily maps to a 
corresponding grayscale value. However, data representation formats other than 
the tristimulus CIELAB L*a*b* format of the preferred embodiment can be used. 
For example, color data could be stored in CIELUV format, where tristimulus 
values represent brightness, hue, and saturation. Alternately, color data could be 
encoded in tristimulus RGB format, CMY (Cyan, Magenta, Yellow) format or in 
CMYK format (with added Black component). Or, color data could be encoded in 
a proprietary tristimulus data format, such as in KODAK Photo YCC Color 
Interchange Space, for example. In order to store all of the component values for 
the selected color space, the rasterized data values to be encoded would have a 
large bit depth, such as 24 or 32 bits in some cases. Monochrome data word 120, 
however, into which the components of tristimulus and other formats would be 
encoded, would have a small bit depth, such as the 8-bit monochrome data word 
120 of Figure 6. The arrangement of fields within monochrome data word 120 
can be freely adapted to suit the encoding requirements for color accuracy. As 
with the L* channel information in the example of Figure 5, it may work best to 
map one component of color data using relatively more bits. For RGB color data, 
for example, it may be most effective to map green values to a 4-bit field, while 
mapping red and blue values, which may have less impact on some images, to 
smaller 2-bit fields. The values used in any field could be pointers to other values, 
such as the L*, a*, and b* channel values in first, second, and third data fields 
114, 116, and 118 of Figure 6. Or, these values could be sufficient in themselves, 
as might a 4-bit L* channel value stored in first data field 114. Overall, the 
methods of the present invention as disclosed herein could be used for mapping 
any type of color representation data format from one data structure to another. 
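
Because the arrangement of fields can be freely adapted, packing and unpacking can be driven by a layout description rather than by fixed shifts. The sketch below assumes a layout given as a list of (name, bit width) pairs, most significant field first. The layout `RGB_LAYOUT`, with green in the upper 4 bits, and the function names are assumptions made for illustration.

```python
def pack_fields(values, layout):
    """Pack named field values into one word, most significant field first."""
    word = 0
    for name, bits in layout:
        value = values[name]
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{name} does not fit in {bits} bits")
        word = (word << bits) | value
    return word

def unpack_fields(word, layout):
    """Inverse of pack_fields: split a word back into named fields."""
    values = {}
    for name, bits in reversed(layout):
        values[name] = word & ((1 << bits) - 1)
        word >>= bits
    return values

# Assumed 8-bit arrangement for RGB data: 4 bits green, 2 red, 2 blue.
RGB_LAYOUT = [("green", 4), ("red", 2), ("blue", 2)]
```

The same two functions serve for the L*a*b* arrangement by substituting a layout of 4, 2, and 2 bits for the L*, a*, and b* fields.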

Images printed on preserved document record 90 could be positive 
or negative, with image density appropriately assigned for the preservation 
medium. 

Depending on factors such as image type, spatial resolution, and 
data bit depth available due to density resolution, any number of alternate 
mapping schemes could be implemented, including the following: 

(a) use of "guard bits". Deliberate assignment of guard bits as 
separators for data fields may help to more clearly distinguish 
encoded data values; and 

(b) use of neighboring values and relative offsets. A number of 
data representation schemes can be employed that extrapolate 
image values for a pixel from those of neighboring pixels or 
that provide only offsets from an averaged value. 
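
As one simple variant of scheme (b), the sketch below stores the first value of a scan line as-is and each following value as an offset from the preceding pixel. This particular choice of neighbor, and the function names, are assumptions; the description above does not fix a specific scheme.

```python
def to_offsets(values):
    """First value as-is, then the difference from the preceding pixel."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]

def from_offsets(offsets):
    """Rebuild the original values by accumulating the offsets."""
    values, current = [], 0
    for offset in offsets:
        current += offset
        values.append(current)
    return values
```

In smooth image regions the offsets are small, so they may fit in fewer bits than the values themselves.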

Decoding Process 

As is true for data decoding in general, the procedure for extracting 
encoded data is the inverse of the corresponding procedure used to encode the 
data. Referring to the flow chart of Figure 11, there is shown a processing 
sequence for decoding data that has been encoded onto a monochrome medium to 
form image 96 having m pixels, using the method of the present invention. 
Referring to Figure 12, the process of Figure 11 uses scanner 86 for obtaining data 
from preserved document record 90 and for obtaining any necessary supporting 
calibration and media-specific metadata from media roll 190. In a setup step 300, 
scanner 86 is used to obtain encoding parameters for use in decoding by control 
processing unit 88. The necessary parameters include those provided in metadata 
section 94, media metadata document 194, and calibration patch 196. For 
dimensionally locating metadata section 94, media metadata document 194, and 
calibration patch 196 to be scanned using scanner 86, registration marks 98 or 
some other optical or mechanical locating means are used. For example, 
mechanical indexing means could be provided, allowing scanner 86 to pinpoint 
the precise location of metadata section 94. Human-readable content in metadata 
section 94, in media metadata document 194, and in labeling for calibration patch 
196 allows flexible options for accessing the setup data that is needed. While 
automatic scanning and interpretation using OCR utilities is preferred, even 
manual interpretation and entry of the necessary setup data for setup step 300 is 
possible, because metadata is provided in human-readable format. Within the data 
itself, specific fields provide registration markings that clearly delineate the 
beginning of each block of data. For example, an identifying preamble clearly 
marks the starting-point for the first field of metadata in metadata section 94. As a 
result of setup step 300, scanner 86 and control processing unit 88 have sufficient 
information to obtain the encoded information from image 96 and to perform the 
processing of subsequent steps in Figure 11 for extracting the encoded data 
values. 

A counter initialization step 302 and a counter increment step 304 
are provided to illustrate the mechanics of looping operation for processing each 
of m image pixels. For each pixel n, a grayscale read step 306 receives a 
grayscale value, grayscale_value_n, from the scanner and stores this value in a data 
word, monochrome_word_n, which has sufficient bit depth for storing the grayscale 
value. Then, in a data field restoration step 308, a plurality of data fields are 
extracted from monochrome_word_n. In this step, then, monochrome_word_n is 
decomposed into its component fields. As was described with reference to Figure 
6, a preferred embodiment of the present invention encodes three data fields 114, 
116, and 118 into monochrome data word 120. Then, to reverse the mapping 
operation shown in the example of Figure 5, an inverse mapping operation is 
executed to obtain, from monochrome data word 120, a composite raster value 
100 or other decoded composite data value or set of data values, based on the 
encoding scheme that was used. In the preferred embodiment, this composite data 
value is a tristimulus color image value using CIELAB color space, with 
component values of lightness (L*), a-chroma (a*) and b-chroma (b*). Finally, a 
looping decision step 310 determines whether or not each pixel has been scanned 
and its corresponding data processed. 
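
The per-pixel loop can be sketched in Python as follows, assuming an 8-bit word laid out as 4 bits of L*, 2 bits of a*, and 2 bits of b*, most significant field first. The decode tables `L_TABLE`, `A_TABLE`, and `B_TABLE` are hypothetical stand-ins for the decode values that setup step 300 would obtain from the metadata, and the function names are introduced only for this example.

```python
# Hypothetical decode tables, indexed by encoded field value.
L_TABLE = [6 * i for i in range(16)]   # 16 lightness levels
A_TABLE = [-40, -10, 10, 40]           # 4 a* chroma levels
B_TABLE = [-40, -10, 10, 40]           # 4 b* chroma levels

def restore_fields(monochrome_word):
    """Step 308: decompose an 8-bit word into its three field indices."""
    l_index = (monochrome_word >> 4) & 0x0F
    a_index = (monochrome_word >> 2) & 0x03
    b_index = monochrome_word & 0x03
    return l_index, a_index, b_index

def decode_image(grayscale_values):
    """Decode every scanned pixel into an (L*, a*, b*) triple."""
    decoded = []
    for n in range(len(grayscale_values)):       # steps 302, 304, 310
        monochrome_word = grayscale_values[n]    # step 306
        l_i, a_i, b_i = restore_fields(monochrome_word)
        # Inverse mapping: look up the restored value for each field.
        decoded.append((L_TABLE[l_i], A_TABLE[a_i], B_TABLE[b_i]))
    return decoded
```

The explicit counter mirrors the flow chart; iterating directly over the scanned values would be equivalent.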

Once all pixels have been scanned and all data values extracted, it 
is then possible to reconstruct the original encoded document, and any data 
coupled to the document, from the data obtained. As was noted above, the 
encoding may or may not be lossy. 

Those skilled in the computing arts can readily recognize that the 
flow chart of Figure 11 illustrates only one possible implementation of image 
decoding for each pixel using a loop that utilizes the mechanics of steps 302, 304, 
and 310. Alternate logic flow sequences could be used. In practice, grayscale 
read step 306 would most likely read the grayscale value for some group of pixels 
or for all pixels in one operation, with the operations of data field restoration step 
308 performed once all pixels have been read. The data word monochrome_word_n provides 
a convenient data structure for the sake of the example of Figure 11; however, a 
temporary register or other suitable structure could be used. Regardless of the 
exact processing mechanics, however, the basic data access, field extraction, 
and inverse mapping scheme outlined in steps 300, 306, and 308 of 
Figure 11 would be carried out in some fashion in order to implement the method 
of the present invention. The resulting data values obtained using the decoding 
process could then be printed, displayed, stored, or used as input for further 
processing, for example. 

Those skilled in the art of encoding and decoding information will 
also observe that additional supporting steps could be provided to optimize the 
basic procedure illustrated in Figure 11. For example, it may be useful to 
preprocess scanned image data in order to minimize the impact of noise, such as 
might be due to dust or scratches. Similarly, error checking and correction 
algorithms could also be applied in order to improve the accuracy of the scanned 
data obtained. 
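
One possible preprocessing step, offered here only as an assumption and not as part of the described method, is to scan the same line several times and keep the per-pixel median, so that a transient defect present in a minority of scans is rejected without smoothing across neighboring pixels, whose low-order bits carry independent data. The function name `combine_scans` is hypothetical.

```python
from statistics import median_low

def combine_scans(scans):
    """Return the per-pixel median over several scans of the same line."""
    if len({len(scan) for scan in scans}) != 1:
        raise ValueError("scans must be non-empty and of equal length")
    # median_low always returns one of the scanned values, never an average.
    return [median_low(column) for column in zip(*scans)]
```

Spatial filters such as a neighborhood median would generally be unsuitable here, since they would alter the encoded low-order bits.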

It can be appreciated that the method of the present invention 
allows a flexible procedure for extracting data encoded within image pixels in a 
monochrome medium. The method of the present invention allows compact data 
storage that integrally couples data to an image and provides suitable techniques 
for obtaining and decoding the stored data at some time in the near or distant 
future. 

Variable Levels of Data Encoding and Decoding 

The method of the present invention also allows a provider of 
document preservation services the flexibility to offer variable levels of data 
encoding and decoding. For example, it may be desirable initially to encode a 
substantial amount of information with an image, using a relatively large bit depth 
with multiple fields. Then, in response to standard requests for decoding, it may 
be sufficient to provide only some of the data fields stored. Thus, for example, 
different levels of decoding could be made available, at different cost for each 
level of decoding request. As one example, referring to Figure 13, an image could 
be encoded with data at a bit depth of 12 bits. Monochrome data word 120 might 
then comprise four data fields, with data encoded as follows: 

first data field 114, containing encoded L* value in 4 bits; 

second data field 116, containing encoded a* value in 2 bits; 

third data field 118, containing encoded b* value in 2 bits; and 

a fourth data field 119, containing encoded audio data value in 4 bits. 

Then, using this model, the following levels of decoding could be available: 

(a) a first level of decoding, in which only the lightness 
channel L* value from first data field 114 is provided; 

(b) a second level of decoding, in which a decoded L*a*b* 
value from first, second, and third data fields 114, 116, and 118 
is provided; and 

(c) a third level of decoding, in which the full L*a*b* value is 
provided plus the audio data value from fourth data field 119. 

Of course, alternate data levels could be provided for extracting 
data from preserved document record 90 in any combination of fields, such as, for 
example, decoding only the audio data value from fourth data field 119. In any 
case, it would be possible for a provider of digital preservation services to offer its 
subscribers varying levels of data record preservation, at different pricing, based 
on how much of the preserved information coupled with an image is needed. A 
provider of digital preservation services could then utilize low-cost scanner 86 
apparatus more effectively for responding to decoding requests that require only a 
portion of the image bit depth. Customer requests requiring the fully encoded data 
would then require scanners 86 having higher density resolvability levels. 
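
The three decoding levels can be sketched as follows, assuming a 12-bit word in which the L* field occupies the most significant 4 bits, followed by the a* field, the b* field, and the 4-bit audio field. This field order and the function name `decode_level` are assumptions made for illustration.

```python
def decode_level(word12, level):
    """Return the field indices made available at the requested level."""
    if level not in (1, 2, 3):
        raise ValueError("level must be 1, 2, or 3")
    result = {"L": (word12 >> 8) & 0x0F}        # first level: L* only
    if level >= 2:
        result["a"] = (word12 >> 6) & 0x03      # second level adds chroma
        result["b"] = (word12 >> 4) & 0x03
    if level == 3:
        result["audio"] = word12 & 0x0F         # third level adds audio
    return result
```

Under this layout the first level depends only on the upper 4 bits of the word, which is why a scanner with lower density resolvability may suffice for such requests.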

The invention has been described in detail with particular reference 
to certain preferred embodiments thereof, but it will be understood that variations 
and modifications can be effected within the scope of the invention. Therefore, 
what is provided is a method for decoding of data associated with an image on 
monochrome media. 